Grapharizer: A Graph-Based Technique for Extractive Multi-Document Summarization

Abstract

In the age of big data, the amount of data on the Internet keeps growing, and it becomes frustrating for users to locate the data they want. Text summarization has emerged as a solution to this problem: it summarizes the provided documents and presents users with their gist. However, summarizer systems face challenges such as poor grammaticality, missing important information, and redundancy, particularly in multi-document summarization (MDS). This study involves the development of a graph-based, extractive, generic MDS technique named Grapharizer (GRAPH-based summARIZER), which focuses on resolving these challenges. Grapharizer addresses the grammaticality problems of the summary using lemmatization during pre-processing. Furthermore, synonym mapping, multi-word expression mapping, and anaphora and cataphora resolution contribute positively to improving the generated summary. Challenges such as redundancy and proper coverage of all topics are dealt with to achieve informativity and representativeness. Grapharizer is a novel approach that can also be used in combination with different machine learning models. The system was tested on the DUC 2004 and Recent News Article datasets against various state-of-the-art techniques. Use of Grapharizer increased accuracy by up to 23.05% compared with baseline techniques on ROUGE scores. Expert evaluation of the proposed system indicated an accuracy of more than 55%.
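
The abstract does not describe Grapharizer's internals, so the following is only a generic sketch of graph-based extractive summarization, not the authors' method. It assumes sentences are nodes, Jaccard word overlap is the edge weight, a PageRank-style iteration scores the nodes, and a greedy pass skips sentences that are too similar to ones already chosen. All function names, the damping factor, and the redundancy threshold are hypothetical choices made for illustration; lemmatization, synonym mapping, and anaphora resolution are not modeled.

```python
import re
from itertools import combinations


def tokenize(sentence):
    """Lowercase a sentence and return its set of alphabetic word tokens."""
    return set(re.findall(r"[a-z]+", sentence.lower()))


def jaccard(a, b):
    """Jaccard overlap between two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_sentences(sentences, damping=0.85, iterations=50):
    """Score sentences by PageRank-style iteration on a similarity graph."""
    n = len(sentences)
    if n == 0:
        return []
    tokens = [tokenize(s) for s in sentences]
    # Symmetric edge weights: word overlap between every pair of sentences.
    weights = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        weights[i][j] = weights[j][i] = jaccard(tokens[i], tokens[j])
    out_sum = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) / n
            + damping * sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n)
                if out_sum[j] > 0
            )
            for i in range(n)
        ]
    return scores


def summarize(sentences, max_sentences=2, redundancy_threshold=0.5):
    """Pick top-ranked sentences, skipping near-duplicates of earlier picks."""
    scores = rank_sentences(sentences)
    order = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    chosen = []
    for i in order:
        if len(chosen) == max_sentences:
            break
        candidate = tokenize(sentences[i])
        if all(
            jaccard(candidate, tokenize(sentences[k])) < redundancy_threshold
            for k in chosen
        ):
            chosen.append(i)
    # Return the selected sentences in their original order.
    return [sentences[i] for i in sorted(chosen)]
```

Calling `summarize` on the pooled sentences of several documents returns at most `max_sentences` sentences in their original order. Lowering `redundancy_threshold` makes the selection stricter about overlap, at the cost of possibly dropping relevant sentences.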

Similar resources

Semi-extractive Multi-document Summarization

In this thesis, I design a model for extractive multi-document summarization based on the Maximum Coverage problem with KnaPsack constraint (MCKP). The model integrates three measures to detect important sentences: Coverage, which rewards sentences according to how well they represent the whole document; Relevance, which selects sentences related to the given query; and Compression,...

Graph-based models for multi-document summarization

University of Ljubljana, Faculty of Computer and Information Science. Ercan Canhasi. This thesis is about automatic document summarization, with experimental results on general, query, update and comparative multi-document summarization (MDS). We describe prior work and our own improvements on some important aspects of a summarization system, incl...

Extractive Multi-document Summarization Using Multilayer Networks

Huge volumes of textual information are produced every single day. To organize and understand such large datasets, summarization techniques have become popular in recent years. These techniques aim at finding relevant, concise and non-redundant content in such big data. While network methods have been adopted to model texts in some scenarios, a systematic evaluation of multi...

Graph based Extractive Multi-document Summarizer for Malayalam-an Experiment

Multidocument summarization is an automatic process that generates a summary extract from multiple documents written about the same topic. Of the many summarization systems developed for the English language, the graph-based system is found to be more effective. This paper mainly focuses on a multidocument summarization system for the Malayalam language that follows a graph-based approach. The proposed model...

Topical Coherence for Graph-based Extractive Summarization

We present an approach for extractive single-document summarization. Our approach is based on a weighted graphical representation of documents obtained by topic modeling. We optimize importance, coherence and non-redundancy simultaneously using ILP. We compare ROUGE scores of our system with state-of-the-art results on scientific articles from PLOS Medicine and on DUC 2002 data. Human judges ev...

Journal

Journal title: Electronics

Year: 2023

ISSN: 2079-9292

DOI: https://doi.org/10.3390/electronics12081895